Search results for "Combinatorics on word"
showing 10 items of 55 documents
Block Sorting-Based Transformations on Words: Beyond the Magic BWT
2018
The Burrows-Wheeler Transform (BWT) is a word transformation introduced in 1994 for Data Compression and later results have contributed to make it a fundamental tool for the design of self-indexing compressed data structures. The Alternating Burrows-Wheeler Transform (ABWT) is a more recent transformation, studied in the context of Combinatorics on Words, that works in a similar way, using an alternating lexicographical order instead of the usual one. In this paper we study a more general class of block sorting-based transformations. The transformations in this new class prove to be interesting combinatorial tools that offer new research perspectives. In particular, we show that all the tra…
Bacteria classification using minimal absent words
2017
Bacteria classification has been deeply investigated with different tools for many purposes, such as early diagnosis, metagenomics, phylogenetics. Classification methods based on ribosomal DNA sequences are considered a reference in this area. We present a new classificatier for bacteria species based on a dissimilarity measure of purely combinatorial nature. This measure is based on the notion of Minimal Absent Words, a combinatorial definition that recently found applications in bioinformatics. We can therefore incorporate this measure into a probabilistic neural network in order to classify bacteria species. Our approach is motivated by the fact that there is a vast literature on the com…
DNA combinatorial messages and Epigenomics: The case of chromatin organization and nucleosome occupancy in eukaryotic genomes
2019
Abstract Epigenomics is the study of modifications on the genetic material of a cell that do not depend on changes in the DNA sequence, since those latter involve specific proteins around which DNA wraps. The end result is that Epigenomic changes have a fundamental role in the proper working of each cell in Eukaryotic organisms. A particularly important part of Epigenomics concentrates on the study of chromatin, that is, a fiber composed of a DNA-protein complex and very characterizing of Eukaryotes. Understanding how chromatin is assembled and how it changes is fundamental for Biology. In more than thirty years of research in this area, Mathematics and Theoretical Computer Science have gai…
Quasi-linear time computation of the abelian periods of a word
2012
Computing abelian periods in words
2011
International audience
A NEW COMPLEXITY FUNCTION FOR WORDS BASED ON PERIODICITY
2013
Motivated by the extension of the critical factorization theorem to infinite words, we study the (local) periodicity function, i.e. the function that, for any position in a word, gives the size of the shortest square centered in that position. We prove that this function characterizes any binary word up to exchange of letters. We then introduce a new complexity function for words (the periodicity complexity) that, for any position in the word, gives the average value of the periodicity function up to that position. The new complexity function is independent from the other commonly used complexity measures as, for instance, the factor complexity. Indeed, whereas any infinite word with bound…
On Balancing of a Direct Product
2009
A direct product of two sequences is a naturally defined sequence on the alphabet of pairs of symbols. By taking inspiration from [Pavel Salimov. On uniform recurrence of a direct product. In AutoMathA, 2009], where the author investigates the case of uniformly recurrent words, here, we study when the product of two balanced sequences on binary alphabet is also balanced.
Languages with mismatches
2007
AbstractIn this paper we study some combinatorial properties of a class of languages that represent sets of words occurring in a text S up to some errors. More precisely, we consider sets of words that occur in a text S with k mismatches in any window of size r. The study of this class of languages mainly focuses both on a parameter, called repetition index, and on the set of the minimal forbidden words of the language of factors of S with errors. The repetition index of a string S is defined as the smallest integer such that all strings of this length occur at most in a unique position of the text S up to errors. We prove that there is a strong relation between the repetition index of S an…
Burrows-Wheeler transform and palindromic richness
2009
AbstractThe investigation of the extremal case of the Burrows–Wheeler transform leads to study the words w over an ordered alphabet A={a1,a2,…,ak}, with a1<a2<⋯<ak, such that bwt(w) is of the form aknkak−1nk−1⋯a2n2a1n1, for some non-negative integers n1,n2,…,nk. A characterization of these words in the case |A|=2 has been given in [Sabrina Mantaci, Antonio Restivo, Marinella Sciortino, Burrows-Wheeler transform and Sturmian words, Information Processing Letters 86 (2003) 241–246], where it is proved that they correspond to the powers of conjugates of standard words. The case |A|=3 has been settled in [Jamie Simpson, Simon J. Puglisi, Words with simple Burrows-Wheeler transforms, Electronic …
Bounded Bi-ideals and Linear Recurrence
2013
Bounded bi-ideals are a subclass of uniformly recurrent words. We introduce the notion of completely bounded bi-ideals by imposing a restriction on their generating base sequences. We prove that a bounded bi-ideal is linearly recurrent if and only if it is completely bounded.